DPSLDU(3S)                                                          DPSLDU(3S)

NAME
     DPSLDU_Destroy, DPSLDU_ExtractPerm, DPSLDU_Factor, DPSLDU_FactorOOC,
     DPSLDU_OOCLimit, DPSLDU_OOCPath, DPSLDU_Ordering, DPSLDU_Preprocess,
     DPSLDU_PreprocessZ, DPSLDU_Solve, DPSLDU_SolveM, DPSLDU_Storage -
     Parallel sparse unsymmetric solver for linear systems of real equations

SYNOPSIS
     Fortran synopsis:

          SUBROUTINE DPSLDU_DESTROY (token)
          INTEGER token

          SUBROUTINE DPSLDU_EXTRACTPERM (token, perm)
          INTEGER token, perm(*)

          SUBROUTINE DPSLDU_FACTOR (token, n, pointers, indices, values)
          INTEGER token, n, pointers(*), indices(*)
          DOUBLE PRECISION values(*)

          SUBROUTINE DPSLDU_FACTOROOC (token, n, pointers, indices, values)
          INTEGER token, n, pointers(*), indices(*)
          DOUBLE PRECISION values(*)

          SUBROUTINE DPSLDU_OOCLIMIT (token, ooclimit)
          INTEGER token
          DOUBLE PRECISION ooclimit

          SUBROUTINE DPSLDU_OOCPATH (token, oocpath)
          INTEGER token
          CHARACTER oocpath(*)

          SUBROUTINE DPSLDU_ORDERING (token, method)
          INTEGER token, method

          SUBROUTINE DPSLDU_PREPROCESS (token, n, pointers, indices,
          non_zeros, ops)
          INTEGER token, n, pointers(*), indices(*)
          INTEGER*8 non_zeros
          DOUBLE PRECISION ops

          SUBROUTINE DPSLDU_PREPROCESSZ (token, n, pointers, indices, mask,
          non_zeros, ops)
          INTEGER token, n, pointers(*), indices(*), mask(*)
          INTEGER*8 non_zeros
          DOUBLE PRECISION ops

          SUBROUTINE DPSLDU_SOLVE (token, x, b)
          INTEGER token
          DOUBLE PRECISION x(*), b(*)

          SUBROUTINE DPSLDU_SOLVEM (token, X, ldx, B, ldb, nrhs)
          INTEGER token, ldx, ldb, nrhs
          DOUBLE PRECISION X(*), B(*)

          DOUBLE PRECISION FUNCTION DPSLDU_STORAGE(token)
          INTEGER token

     C/C++ synopsis:

          #include <scsl_sparse.h>

          void DPSLDU_Destroy (int token);

          void DPSLDU_ExtractPerm (int token, int perm[]);

          void DPSLDU_Factor (int token, int n, int pointers[], int indices[],
          double values[]);

          void DPSLDU_FactorOOC (int token, int n, int pointers[], int
          indices[], double values[]);

          void DPSLDU_OOCLimit (int token, double ooclimit);

          void DPSLDU_OOCPath (int token, char oocpath[]);

          void DPSLDU_Ordering (int token, int method);

          void DPSLDU_Preprocess (int token, int n, int pointers[], int
          indices[], long long *non_zeros, double *ops);

          void DPSLDU_PreprocessZ (int token, int n, int pointers[], int
          indices[], int mask[], long long *non_zeros, double *ops);

          void DPSLDU_Solve (int token, double x[], double b[]);

          void DPSLDU_SolveM (int token, double X[], int ldx, double B[], int
          ldb, int nrhs);

          double DPSLDU_Storage (int token);

IMPLEMENTATION
     These routines are part of the SCSL Scientific Library and can be loaded
     using either the -lscs or the -lscs_mp option.  The -lscs_mp option
     directs the linker to use the multi-processor version of the library.

     When linking to SCSL with -lscs or -lscs_mp, the default integer size is
     4 bytes (32 bits). Another version of SCSL is available in which integers
     are 8 bytes (64 bits). This version allows the user access to larger
     memory sizes and helps when porting legacy Cray codes.  It can be loaded
     by using the -lscs_i8 option or the -lscs_i8_mp option.  A program may
     use only one of the two versions; 4-byte integer and 8-byte integer
     library calls cannot be mixed.

     The C and C++ prototypes shown above are appropriate for the 4-byte
     integer version of SCSL. When using the 8-byte integer version, the
     variables of type int become long long and the <scsl_sparse_i8.h> header
     file should be included.

DESCRIPTION
     DPSLDU solves sparse unsymmetric linear systems of the form Ax = b where
     A is a real n-by-n input matrix having symmetric non-zero pattern but
     unsymmetric non-zero values, b is a real input vector of length n, and x
     is an unknown real vector of length n.

     DPSLDU uses a direct method. A is factored into the following form:

          A = L D U

     where L is a lower triangular matrix with unit diagonal, D is a diagonal
     matrix, and U is an upper triangular matrix with unit diagonal.

     Note that NO PIVOTING FOR STABILITY is performed during factorization.
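
     The factored form and the effect of skipping pivoting can be seen on a
     small dense matrix. The following is a minimal sketch, not the sparse
     algorithm used by DPSLDU; the names ldu_factor_nopivot() and
     ldu_solve() are hypothetical and are not part of SCSL. The matrix is
     assumed to be dense and stored row by row.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Factor a dense n-by-n row-major matrix in place as A = L D U without
 * pivoting.  On return the strict lower triangle holds L, the diagonal
 * holds D, and the strict upper triangle holds U (unit diagonals implied).
 * Returns 0 on success, or k+1 if the magnitude of pivot k is <= tol.
 */
int ldu_factor_nopivot(int n, double *a, double tol)
{
    assert(n >= 0 && a != NULL);
    for (int k = 0; k < n; k++) {
        double d = a[k * n + k];
        double mag = d < 0.0 ? -d : d;
        if (mag <= tol)
            return k + 1;                /* zero or tiny pivot: give up */
        for (int i = k + 1; i < n; i++)
            a[i * n + k] /= d;           /* column k of L */
        for (int j = k + 1; j < n; j++)
            a[k * n + j] /= d;           /* row k of U */
        for (int i = k + 1; i < n; i++)  /* update the remaining block */
            for (int j = k + 1; j < n; j++)
                a[i * n + j] -= a[i * n + k] * d * a[k * n + j];
    }
    return 0;
}

/* Solve L D U x = b in place; x holds b on entry and the solution on exit. */
void ldu_solve(int n, const double *a, double *x)
{
    assert(n >= 0 && a != NULL && x != NULL);
    for (int i = 1; i < n; i++)          /* forward substitution with L */
        for (int j = 0; j < i; j++)
            x[i] -= a[i * n + j] * x[j];
    for (int i = 0; i < n; i++)          /* scale by the inverse of D */
        x[i] /= a[i * n + i];
    for (int i = n - 2; i >= 0; i--)     /* back substitution with U */
        for (int j = i + 1; j < n; j++)
            x[i] -= a[i * n + j] * x[j];
}
```

     A non-zero return value from ldu_factor_nopivot() corresponds to the
     zero-pivot situation discussed under "Matrices with zeros on the
     diagonal".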

     The DPSLDU library contains five main routines.

     *   DPSLDU_Ordering() allows the user to select one of five possible
         reordering methods to be used in the matrix preprocessing phase.

     *   DPSLDU_Preprocess() performs preprocessing operations on the
         structure of A (heuristic reordering to reduce fill in L and U,
         symbolic factorization, etc.).

     *   DPSLDU_Factor() factors the matrix A into L, D, and U, using the
         previously computed preprocessing data.

     *   DPSLDU_Solve() solves for a vector x, given an input vector b.

     *   DPSLDU_Destroy() frees all storage associated with the matrix A
         (including L, D, U, and various data structures computed during
         preprocessing).

     The user can call DPSLDU_Factor() several times after a single call to
     DPSLDU_Preprocess() to factor multiple matrices with identical non-zero
     structures but different values.  Similarly, the user can call
     DPSLDU_Solve() several times after a single call to DPSLDU_Factor() to
     solve for multiple right-hand-sides.  Also, the user can call
     DPSLDU_SolveM() to solve for multiple right-hand-sides all stored in a
     single array.

   Sparse Matrix Format
     Sparse matrix A must be input to DPSLDU in Harwell-Boeing format (also
     known as Compressed Column Storage format).

     The matrix is held in three arrays: pointers, indices, and values.  The
     indices array contains the row indices of the non-zeros in A. The values
     array holds the corresponding non-zero values. The pointers array
     contains the index in indices for the first non-zero in each column of A.
     Thus, the row indices for the non-zeros in column i can be found in
     locations indices[pointers[i]] through indices[pointers[i+1]-1]. The
     corresponding values can be found in locations values[pointers[i]]
     through values[pointers[i+1]-1].

     DPSLDU imposes one constraint on the representation of the A matrix. The
     non-zeros within each column must appear in order of increasing row
     number.
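
     Both the ordering constraint and the symmetric non-zero pattern
     required by DPSLDU can be checked before the arrays are handed to the
     solver. The following is a minimal sketch for zero-based (C) arrays;
     csc_check_pattern() and csc_has_entry() are hypothetical helpers, not
     SCSL routines, and the sketch assumes that duplicate row indices within
     a column are not allowed.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if entry (row, col) is stored in the zero-based pattern. */
static int csc_has_entry(const int *pointers, const int *indices,
                         int row, int col)
{
    for (int k = pointers[col]; k < pointers[col + 1]; k++)
        if (indices[k] == row)
            return 1;
    return 0;
}

/*
 * Check a zero-based compressed-column pattern.
 * Returns 0 if it is acceptable, 1 if a row index is out of range or the
 * row indices of some column are not strictly increasing, and 2 if the
 * non-zero pattern is not symmetric.
 */
int csc_check_pattern(int n, const int *pointers, const int *indices)
{
    assert(n >= 0 && pointers != NULL && indices != NULL);
    for (int j = 0; j < n; j++) {
        for (int k = pointers[j]; k < pointers[j + 1]; k++) {
            if (indices[k] < 0 || indices[k] >= n)
                return 1;
            if (k > pointers[j] && indices[k] <= indices[k - 1])
                return 1;
        }
    }
    /* Every stored entry (i, j) must have a stored partner (j, i). */
    for (int j = 0; j < n; j++)
        for (int k = pointers[j]; k < pointers[j + 1]; k++)
            if (!csc_has_entry(pointers, indices, j, indices[k]))
                return 2;
    return 0;
}
```

     The symmetry test is quadratic in the column length in the worst case,
     which is acceptable for a one-time sanity check but not for an inner
     loop.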

     In the following example, the unsymmetric matrix

          1.0 0.0 5.0 0.0
          0.0 3.0 0.0 8.0
          2.0 0.0 7.0 0.0
          0.0 4.0 0.0 9.0

     would be represented in Fortran as follows:

           INTEGER pointers(5), indices(8), i
           DOUBLE PRECISION values(8)
           DATA (pointers(i), i = 1, 5) / 1, 3, 5, 7, 9 /
           DATA (indices(i),  i = 1, 8) / 1, 3, 2, 4, 1, 3, 2, 4 /
           DATA (values(i),   i = 1, 8) / 1.0, 2.0, 3.0, 4.0, 5.0,
          &                               7.0, 8.0, 9.0 /

     Zero-based indexing is used in C, so the pointers, indices, and values
     arrays would contain the following:

          int pointers[]  = {0, 2, 4, 6, 8};
          int indices[]   = {0, 2, 1, 3, 0, 2, 1, 3};
          double values[] = {1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 8.0, 9.0};
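
     The column-by-column layout can be exercised with a matrix-vector
     product, which also gives a simple way to check a computed solution by
     forming the residual b - Ax. The following is a minimal sketch for the
     zero-based arrays shown above; csc_matvec() and csc_residual_max() are
     hypothetical helpers and are not part of SCSL.

```c
#include <assert.h>
#include <stddef.h>

/* y = A x for an n-by-n matrix held in zero-based compressed-column form. */
void csc_matvec(int n, const int *pointers, const int *indices,
                const double *values, const double *x, double *y)
{
    assert(n >= 0 && pointers && indices && values && x && y);
    for (int i = 0; i < n; i++)
        y[i] = 0.0;
    for (int j = 0; j < n; j++)                        /* column j */
        for (int k = pointers[j]; k < pointers[j + 1]; k++)
            y[indices[k]] += values[k] * x[j];         /* row indices[k] */
}

/*
 * Largest |b[i] - (A x)[i]| over all rows.  work is caller-supplied
 * scratch space of length n.
 */
double csc_residual_max(int n, const int *pointers, const int *indices,
                        const double *values, const double *x,
                        const double *b, double *work)
{
    double worst = 0.0;
    csc_matvec(n, pointers, indices, values, x, work);
    for (int i = 0; i < n; i++) {
        double r = b[i] - work[i];
        if (r < 0.0)
            r = -r;
        if (r > worst)
            worst = r;
    }
    return worst;
}
```

     Because DPSLDU does not pivot for stability, comparing the value
     returned by csc_residual_max() against the size of b is a cheap check
     after a solve.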

   Ordering Methods
     The DPSLDU_Ordering(token, method) routine allows the user to change the
     ordering method used to pre-order the matrix before factorization.  This
     routine must be called before calling DPSLDU_Preprocess(). Five options
     are currently available for the method parameter:

     *   Method 0 performs no pre-ordering

     *   Method 1 performs Approximate Minimum Fill ordering

     *   Method 2 performs a single nested dissection ordering (default).
         This method is often called "Extreme matrix ordering".

     *   Method 3 performs multiple nested dissection orderings (in parallel)

     *   Method 4 performs multiple nested dissection (the same as in Method
         3), but it uses a feedback file to "learn" from the previous solves
         of the same matrix structure and it performs more orderings. The
         multiple nested dissection technique of Methods 3 and 4 is also
         referred to as "Extreme2 matrix ordering".

     Method 2 is significantly more expensive than Method 1, but it usually
     produces significantly better orderings.  Method 3 is especially
     effective on multi-processor systems.  It computes OMP_NUM_THREADS (where
     OMP_NUM_THREADS is an environment variable indicating the number of
     processors to be used for parallel computation) matrix orderings using
     different starting points for the algorithm and uses the ordering that
     will lead to the fewest floating-point operations to factorize the
     matrix.

     Method 4 is useful only when the same non-zero structure is used for
     multiple solves.  Method 4 keeps, in a "feedback" file, a signature of
     the non-zero structure for a maximum of 200 matrices, together with the
     best starting point saved from a previous solve for each structure.  In
     the next Method 4 ordering for that non-zero structure, that best
     starting point and 2 * OMP_NUM_THREADS - 1 new ones generate orderings.
     The best ordering is used.  In this way, the quality of orderings stays
     the same or improves over time.

     Methods 3 and 4 typically take more time for the matrix preprocessing
     than the default.  However, on large systems or on repeated
     factorizations, significant overall speedups (1.1X to 2X) can be obtained
     compared to Method 2.
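
     Why the ordering matters can be seen by counting fill for two
     elimination orders of the same pattern. The following is a minimal
     sketch of symbolic elimination on a dense 0/1 adjacency array; it is an
     illustration only and is not one of the ordering heuristics used by
     DPSLDU. The name count_fill_pairs() and the dense representation are
     assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Count fill pairs created by symbolic elimination of a symmetric pattern.
 * adj is an n*n row-major 0/1 array (adj[i*n+j] != 0 when A(i,j) is
 * stored); order[k] is the node eliminated at step k.  Each pair counted
 * stands for one new entry in L and one in U.  Returns -1 if memory
 * allocation fails.
 */
long count_fill_pairs(int n, const unsigned char *adj, const int *order)
{
    assert(n > 0 && adj != NULL && order != NULL);
    size_t cells = (size_t)n * (size_t)n;
    unsigned char *g = malloc(cells);            /* working copy of adj */
    unsigned char *done = calloc((size_t)n, 1);  /* eliminated nodes */
    long fill = -1;

    if (g != NULL && done != NULL) {
        memcpy(g, adj, cells);
        fill = 0;
        for (int k = 0; k < n; k++) {
            int p = order[k];
            done[p] = 1;
            /* Connect every pair of remaining neighbours of p. */
            for (int i = 0; i < n; i++) {
                if (done[i] || !g[i * n + p])
                    continue;
                for (int j = i + 1; j < n; j++) {
                    if (done[j] || !g[j * n + p] || g[i * n + j])
                        continue;
                    g[i * n + j] = 1;
                    g[j * n + i] = 1;
                    fill++;
                }
            }
        }
    }
    free(g);
    free(done);
    return fill;
}
```

     For a 5-by-5 "arrow" pattern in which node 0 is connected to every
     other node, eliminating node 0 first creates 6 fill pairs, while
     eliminating it last creates none.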

   Extracting the permutation vector
     Unless ordering Method 0 is used, DPSLDU applies a symmetric permutation
     to matrix A before the factorization step; the resulting permuted matrix
     generally has significantly less fill-in than the original matrix.  The
     user can obtain the permutation vector associated with a given token by
     calling DPSLDU_ExtractPerm(token, perm). The permutation is returned as
     an integer array of length n, with 1 <= perm(i) <= n (0 <= perm[i] < n
     for C code).

     A value of k for perm(i) implies that node k in the original ordering is
     node i in the new ordering.
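
     Under that convention, entry i of a reordered vector is entry perm[i]
     of the original vector. The following is a minimal sketch for
     zero-based (C) permutations; perm_apply() and perm_invert() are
     hypothetical helpers, not SCSL routines.

```c
#include <assert.h>
#include <stddef.h>

/* reordered[i] = original[perm[i]] for a zero-based permutation. */
void perm_apply(int n, const int *perm, const double *original,
                double *reordered)
{
    assert(n >= 0 && perm && original && reordered);
    for (int i = 0; i < n; i++)
        reordered[i] = original[perm[i]];
}

/*
 * Build the inverse mapping, inverse[perm[i]] = i, which gives the new
 * position of each original node.  Returns 0 on success, or -1 if perm is
 * not a permutation of 0..n-1.
 */
int perm_invert(int n, const int *perm, int *inverse)
{
    assert(n >= 0 && perm && inverse);
    for (int i = 0; i < n; i++)
        inverse[i] = -1;
    for (int i = 0; i < n; i++) {
        int k = perm[i];
        if (k < 0 || k >= n || inverse[k] != -1)
            return -1;               /* out of range or repeated value */
        inverse[k] = i;
    }
    return 0;
}
```

     With perm = {2, 0, 1}, the original vector {10, 20, 30} is reordered to
     {30, 10, 20}, and the inverse mapping is {1, 2, 0}.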

   Matrices with zeros on the diagonal
     As noted above, no pivoting is done for stability during factorization;
     when zero or near-zero pivots are encountered, DPSLDU usually fails. In
     these cases, it may be possible to use DPSLDU_PreprocessZ() to obtain a
     slightly different, but stable, ordering.  The user provides an
     additional integer array, mask, as an argument to DPSLDU_PreprocessZ().
     If mask(i)=0, then DPSLDU will attempt to maximize the diagonal element
     |Aii|.
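
     One plausible way to fill the mask array is to flag every column whose
     diagonal entry is missing or negligibly small. The following is a
     minimal sketch for zero-based arrays; build_zero_diag_mask() is a
     hypothetical helper, and the use of 1 for unflagged nodes is an
     assumption, since this page only defines the meaning of a mask value
     of 0.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Set mask[j] = 0 when the diagonal entry A(j,j) is not stored or has
 * magnitude <= tol, and mask[j] = 1 otherwise (assumed "no special
 * treatment" value).  Returns the number of flagged nodes.
 */
int build_zero_diag_mask(int n, const int *pointers, const int *indices,
                         const double *values, double tol, int *mask)
{
    assert(n >= 0 && pointers && indices && values && mask);
    int flagged = 0;
    for (int j = 0; j < n; j++) {
        double diag = 0.0;           /* stays 0 if A(j,j) is not stored */
        for (int k = pointers[j]; k < pointers[j + 1]; k++)
            if (indices[k] == j)
                diag = values[k];
        if (diag < 0.0)
            diag = -diag;
        mask[j] = (diag > tol) ? 1 : 0;
        if (mask[j] == 0)
            flagged++;
    }
    return flagged;
}
```

     If the function returns 0, no diagonal entry was flagged and the plain
     DPSLDU_Preprocess() path is the natural choice.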

   Memory usage
     The returned value of DPSLDU_Storage() is an estimate of the amount of
     storage required (in millions of bytes) by the solver's data structures
     for a given matrix system.

   Out-of-core Factorization
     The storage associated with the factor can be managed in two ways.  The
     DPSLDU_Factor() routine allocates memory for the factor and manages it
     internally, releasing it only when DPSLDU_Destroy() is called.  The
     alternative is to do out-of-core factorization by calling
     DPSLDU_FactorOOC(). This routine uses a small amount of in-core memory,
     placing the remainder of the factor matrix on disk as it is computed.
     The user can call DPSLDU_OOCPath() to indicate the directory in which the
     factor file should be written, and DPSLDU_OOCLimit() to indicate how much
     memory to use to hold portions of the factor matrix in-core.  More
     in-core memory generally leads to less disk I/O and higher performance
     during the factorization.  The only change required to move from
     in-core factorization to out-of-core factorization is to replace the
     call to DPSLDU_Factor() with a call to DPSLDU_FactorOOC().  The other
     routines (DPSLDU_Solve(), DPSLDU_Destroy(), etc.) handle out-of-core
     factors transparently.  Note that DPSLDU_FactorOOC() and subsequent
     calls to DPSLDU_Solve() are not parallelized (but calls to
     DPSLDU_SolveM() are parallelized, as discussed below).

   Multiple Right-Hand-Sides
     DPSLDU can solve for large numbers of right-hand-sides with one call to
     DPSLDU_SolveM().  It solves these right-hand-sides in parallel, with each
     processor solving up to four at a time for in-core systems and up to
     PSLDU_OOCBLK at a time for out-of-core systems, where PSLDU_OOCBLK is an
     environment variable whose default value is 1.
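
     DPSLDU_SolveM() expects the right-hand-sides as the columns of a
     column-major array with leading dimension ldb (see Arguments). The
     following is a minimal sketch of that layout; pack_rhs_columns() and
     rhs_column() are hypothetical helpers, not SCSL routines.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy nrhs vectors of length n into the columns of a column-major array
 * B with leading dimension ldb >= n.  Element i of right-hand-side j ends
 * up in B[j*ldb + i]; any padding rows (i >= n) are left untouched.
 */
void pack_rhs_columns(int n, int nrhs, const double *const *rhs,
                      double *B, int ldb)
{
    assert(n >= 0 && nrhs >= 0 && ldb >= n && rhs != NULL && B != NULL);
    for (int j = 0; j < nrhs; j++)
        for (int i = 0; i < n; i++)
            B[(size_t)j * (size_t)ldb + (size_t)i] = rhs[j][i];
}

/* Pointer to the first element of column j of a column-major array. */
double *rhs_column(double *B, int ldb, int j)
{
    assert(B != NULL && ldb >= 0 && j >= 0);
    return B + (size_t)j * (size_t)ldb;
}
```

     The same indexing applies to the solution array X with leading
     dimension ldx.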

   In-place Solves
     Both DPSLDU_Solve() and DPSLDU_SolveM() allow the solution vector(s) to
     overwrite the right-hand-side(s) when identical vectors or matrices are
     supplied to these routines.  For example,

           CALL DPSLDU_SOLVE(token, b, b)

     takes the right-hand-side input from b and also returns the solution
     vector in b.  When this option is used with DPSLDU_SolveM(), the leading
     dimensions for the solution and right-hand-side matrices must agree.  The
     amount of memory actually saved by performing an in-place solve depends
     on the number of right-hand-sides used.  For a single right-hand-side,
     there are no net savings versus an out-of-place solve because a temporary
     copy of the input vector is made internally.  For multiple
     right-hand-sides the memory overhead decreases as the ratio of
     right-hand-sides to processors used increases.

   Arguments
     These routines have the following arguments:

     token     (input) DPSLDU can handle multiple matrices simultaneously. The
               token distinguishes between active matrices.  The token passed
               to DPSLDU_Factor() must match the token used in some previous
               call to DPSLDU_Preprocess().  Similarly, the token passed to
               DPSLDU_Solve() must match the token used in some previous call
               to DPSLDU_Factor().  0 <= token <= 19.

     method    (input) An integer specifying the ordering method used during
               preprocessing.  0 <= method <= 4.

     n         (input) The number of rows and columns in the matrix A.
               n >= 0.

     pointers, indices, values
               (input) The pointers and indices arrays store the non-zero
               structure of sparse input matrix A in Harwell-Boeing or
               Compressed Sparse Column (CSC) format.

               The pointers array stores n+1 integers, where pointers[i] gives
               the index in indices of the first non-zero in column i of A.
               The indices array stores the row indices of the non-zeros in A.
               The values array stores the non-zero values in the matrix A.

     non_zeros (output) The number of non-zero values in L and U.

     ops       (output) The number of floating-point operations required to
               factor A.

     mask      (input) An integer array of length n used in
               DPSLDU_PreprocessZ().  If mask(i) = 0, then node i of matrix A
               is ordered after all of its neighbors in an attempt to avoid a
               zero pivot.

     b         (input) The right-hand-side vector in a DPSLDU_Solve() call.

     x         (output) The solution vector in a DPSLDU_Solve() call.

     nrhs      (input) The number of right-hand-side vectors present in a
               DPSLDU_SolveM() call.

     B         (input) The right-hand-side matrix in a DPSLDU_SolveM() call.
               Must be stored in column-major order.

     ldb       (input) The leading dimension of matrix B. ldb >= n.

     X         (output) The solution matrix in a DPSLDU_SolveM() call. Must be
               stored in column-major order.

     ldx       (input) The leading dimension of matrix X. ldx >= n.

     oocpath   (input) A character array/string with a path to the directory
               where the temporary out-of-core factor files should be stored.
               If this path is on a striped (or RAID-0) file system, the
               performance of the out-of-core solves can be considerably
               improved.  The default path is /usr/tmp.

     ooclimit  (input) A double precision number indicating the number of
               Mbytes of random access memory that should be used for factor
               storage during a call to DPSLDU_FactorOOC(). Note that there
               are many other arrays used besides those directly used to store
               the factorization, so total RAM usage by the solve will exceed
               this number.  The default is 64 MB.

     perm      (output) An integer array of length n containing the
               permutation used to reorder matrix A.

ENVIRONMENT VARIABLES
     Two environment variables can affect the operation of ordering methods 3
     and 4.  SPARSE_NUM_ORDERS can be used to change the number of orderings
     performed from the default of OMP_NUM_THREADS for Method 3 and
     (2*OMP_NUM_THREADS) for Method 4.  SPARSE_FEEDBACK_FILE can be set to the
     path and file name where the feedback information will be kept;
     otherwise, the default feedback file is $HOME/.sparseFeedback.  This file
     will be less than 5K bytes.

     The environment variable OMP_NUM_THREADS determines the number of
     processors that are used for the numerical factorization and solve
     phases.  Out-of-core solves can be performed in groups of PSLDU_OOCBLK
     right-hand-sides per processor.  Setting the environment variable
     PSLDU_VERBOSE causes DPSLDU to output information about the
     factorization.

NOTES
     These routines are optimized and parallelized for the SGI R8000 and
     R1x000 platforms.

SEE ALSO
     INTRO_SCSL(3S), INTRO_SOLVERS(3S), DPSLDLT(3S), ZPSLDLT(3S), ZPSLDU(3S)